Abstract



Submodular functions are a fundamental object of study in combinatorial optimization,
economics, machine learning, etc. and exhibit a rich combinatorial structure. Many subclasses
of submodular functions have also been well studied and these subclasses widely vary in their
complexity. Our motivation is to understand the relative complexity of these classes of functions.
Towards this, we consider the question of how well one class of submodular functions can
be approximated by another (simpler) class of submodular functions. Such approximations
naturally allow algorithms designed for the simpler class to be applied to the bigger class of
functions. We prove both upper and lower bounds on such approximations. Our main results
are:

• General submodular functions^1 can be approximated by cut functions of directed graphs
to a factor of $n^2/4$, which is tight.

• General symmetric submodular functions^1 can be approximated by cut functions of
undirected graphs to a factor of $n - 1$, which is tight up to a constant.

• Budgeted additive functions can be approximated by coverage functions to a factor of
$e/(e - 1)$, which is tight.

Here $n$ is the size of the ground set on which the submodular function is defined. We also
observe that prior works imply that monotone submodular functions can be approximated by
coverage functions with a factor between $O(\sqrt{n} \log n)$ and $\Omega(n^{1/3}/\log^2 n)$.

1 Introduction

Submodular optimization problems have been a rich area of research in recent years, motivated
by the principle of diminishing marginal returns which is prevalent in real world applications.
Such functions are ubiquitous in diverse disciplines, including economics, algorithmic game theory,
machine learning, combinatorial optimization and combinatorics. While submodular functions can
be minimized efficiently, i.e., in polynomial time [16, 23, 18], many natural optimization problems
over submodular functions are NP-hard, e.g., Max-k-Coverage [20], Max-Cut and Max-DiCut [14],
and Max-Facility-Location [9]. Consequently, many works, specifically in the setting of algorithmic
game theory [7, 10, 11, 17], have explored simpler subclasses of submodular functions for which
the given algorithmic problem can still be well-approximated. Such subclasses of submodular
functions have included cut functions of graphs, coverage functions of set systems, budgeted additive
functions, matroid rank functions, etc.

^1 We additionally assume that the submodular function takes value 0 on the null set and the universe.

Our work is motivated by the question, how complex can a submodular function be? Since 
this is such a fundamental question, it has been asked in different forms previously. Goemans et 
al. [13] consider how many queries to a submodular function are sufficient to infer the value of the 
function, approximately, at every point in the domain. Balcan and Harvey [2] focus on the problem 
of learning submodular functions in a probabilistic model; are few random queries enough to infer 
the value at almost all points in the domain? Badanidiyuru et al. [1] ask whether an approximate 
sketch of a submodular function, or more generally a subadditive function, exists (i.e., can the 
function be represented in polynomial space)? Seshadri and Vondrak [24] consider the testability 
of submodular functions: how many queries does it take to check whether a function is close to 
being submodular? 

We approach this question by noting that not all submodular functions are identically complex
and some have been more amenable to optimization than others. Thus, one natural way to
characterize the relative complexity of one class of submodular functions with respect to another
is to ask how well a function in the first class can be approximated by a function in the second.
Formally, we ask the following question. Given two classes of submodular functions $\mathcal{F}$
and $\mathcal{G}$ (typically $\mathcal{G} \subset \mathcal{F}$), what is the smallest $\theta$ such
that for every $f \in \mathcal{F}$ there exists a $g \in \mathcal{G}$ such that
$f(S) \le g(S) \le \theta \cdot f(S)$ for each $S \subseteq U$? Here class $\mathcal{G}$ would
represent the class of submodular functions which are easier to optimize for some problem and
class $\mathcal{F}$ would represent a bigger class which we want to optimize over. We also note
that this concept of approximation is not special to submodular functions and can be asked for
any two classes of functions. We focus on submodular functions due to their
ubiquitous nature in optimization.
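To make the approximation notion concrete, the following is a minimal brute-force sketch, assuming a small ground set and set functions given as Python callables on frozensets. The helper names (`all_subsets`, `approximation_factor`) and the example pair of functions are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def approximation_factor(f, g, universe):
    """Smallest theta with f(S) <= g(S) <= theta * f(S) for every S.

    Returns None when no finite theta exists, i.e. when g drops below f
    somewhere or g is positive on a set where f is zero.
    """
    theta = 1.0
    for S in all_subsets(universe):
        fs, gs = f(S), g(S)
        if gs < fs:
            return None
        if fs == 0:
            if gs != 0:
                return None
            continue
        theta = max(theta, gs / fs)
    return theta

# Hypothetical example: f is budgeted additive (unit values, budget 2),
# g is the plain additive function |S|.
U = range(4)
f = lambda S: min(2, len(S))
g = lambda S: len(S)
theta = approximation_factor(f, g, U)
```

In this toy instance the worst set is the whole ground set, where g is 4 and f is 2. The enumeration is exponential in the ground set size, so it is only meant for sanity checks on tiny examples.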

Intuitively, this notion of approximation resembles the long and rich line of work that deals
with the algorithmic applications of geometric embeddings, in which the goal is to embed hard
metric spaces into simpler ones. Some successful examples include embedding general metrics into
normed spaces [5], dimension reduction in a Euclidean space [19] and the probabilistic embedding
into ultrametrics [4, 12]. As in the metric case, a natural byproduct of the above approach is that
if there exists an $\alpha$-approximation algorithm for any submodular function in $\mathcal{G}$,
then there exists a $(\theta \cdot \alpha)$-approximation algorithm for all functions in
$\mathcal{F}$. As an application of our approach, we show
how to obtain an algorithm for the online submodular function maximization problem, for general
monotone submodular functions [6]. Previously, results were known for only certain subclasses of
submodular functions; see Appendix E for details.

1.1 Our Results and Techniques 

We start by asking how well a general submodular function $f : 2^U \to \mathbb{R}_+$ (with the
additional property that $f(\emptyset) = 0 = f(U)$) can be approximated by a function in the
canonical simpler subfamily of non-symmetric submodular functions, the cut functions of directed
graphs. We give matching upper and lower bounds for such an approximation (Theorem 1.1). Next,
we ask the same question for symmetric submodular functions vis-a-vis their canonical simpler
subfamily, cut functions of undirected graphs. In this case, we provide nearly matching upper and
lower bounds (Theorem 1.2). We then move our attention to two subfamilies, budgeted additive
functions and coverage functions, both of which, as already mentioned in the introduction, have
received considerable interest in the algorithmic game theory setting. We show tight upper and
lower bounds for approximating budgeted additive functions with coverage functions
(Theorem 1.3). These results are summarized in Table 1. While previous works [13, 2, 1] studied
the complexity of submodular functions from different perspectives, they do imply some additional
results, both positive and negative, on the approximation of monotone submodular functions by
simpler classes of submodular functions (as illustrated in Table 1 and discussed in detail in
Appendix G).

Table 1: Our results are described in the first three rows. The results in the last row are either
implicit in the references or follow as a corollary (Appendix G). When the output class is a cut
function of a graph, we assume that the input function $f$ satisfies $f(\emptyset) = f(U) = 0$, as
every cut function must satisfy this constraint. Here, $n$ denotes the size of the ground set.

Let us now briefly discuss the main techniques that we use to obtain our results. In contrast
to previous works [13, 2, 1], arbitrary submodular functions, as opposed to monotone submodular
functions, present different challenges. As an illustration, for approximating a submodular function
$f$ via a cut function of a graph $G$, consider the case when there is a non-trivial set
$\emptyset \ne S \subsetneq U$ for which $f(S) = 0$. Then the weight of the cut $(S, \bar{S})$ in
$G$ is forced to be zero. Indeed, all sets $S$ with $f(S) = 0$ must be in a correspondence with
cuts in $G$ of value zero. Thus, given a submodular function $f$, our construction of $G$ optimizes
for the minimizers of the submodular function $f$. Surprisingly, this can be shown to give the best
possible approximation. For a symmetric submodular function $f$, we show that it suffices to use
the cut function of a tree (as opposed to a general undirected graph) utilizing the Gomory-Hu tree
representation [15, 21] of $f$.

For approximating budgeted additive functions by coverage functions, we first give a randomized
construction achieving an approximation factor of $e/(e - 1)$. We then show that this is the best
possible approximation factor as characterized by a linear program. The proof of the lower bound
of $e/(e - 1)$ uses linear programming duality and proceeds by presenting a feasible dual solution to
the linear program achieving an objective value of $e/(e - 1)$ in the limit. We would like to point
out that all our results are algorithmic and the claimed approximations can be found in polynomial
time given a value oracle for the submodular function.



1.2 Related Work 

Goemans et al. [13] considered the problem of how well a given monotone submodular function $f$
can be approximated when only polynomially many value oracle queries are permitted. They presented
an approximation of $O(\sqrt{n} \log n)$, and an improved guarantee of $\sqrt{n + 1}$ in the case
that $f$ is a matroid rank function. Implicit in this algorithm and relevant to our setting is an
approximation of all monotone submodular functions by budgeted additive functions (Appendix G).
The current best lower bound for the problem studied by [13] is given by Svitkina and Fleischer [25]
and is $\Omega(\sqrt{n}/\log n)$.

Balcan and Harvey [2] take the learning perspective to the study of the complexity of submodular
functions. They study the problem of probabilistically learning a monotone submodular function,
given the values the function takes on a polynomial sized sample of its domain. They provide a
lower bound of $\Omega(n^{1/3})$ on the best possible approximation a learning algorithm can give to
the submodular function, even when it knows the underlying sampling distribution, and the submodular
function to be learned is Lipschitz. Another result with this perspective is by Balcan et al. [3] who
show that a symmetric non-monotone submodular function can be approximated to within $\sqrt{n}$ by
the square root of a quadratic function. Furthermore, they show how to learn such submodular
functions.

Badanidiyuru et al. [1], motivated by the problem of communicating bidders' valuations in
combinatorial auctions, study how well specific classes of set functions can be approximated given
the constraint that the approximating function be representable in polynomially many bits. They
named such an approximation a sketch, proving that coverage functions admit sketches with an
arbitrarily good approximation factor. Additionally, for the larger class of monotone subadditive
functions, they construct sketches that achieve an approximation of
$\sqrt{n} \cdot \mathrm{polylog}(n)$. Combining the results of Badanidiyuru et al. [1] and Balcan
and Harvey [2], a lower bound of $\Omega(n^{1/3}/\log^2 n)$ follows for the approximation of
monotone submodular functions by budgeted additive functions (Appendix G).

Testing of submodular functions has been studied recently by Seshadri and Vondrak [24] for 
general monotone submodular functions and for coverage functions by Chakrabarty and Huang [8] . 
The goal here is to query the function on few domain points and answer whether a function is close 
to being submodular or not. The measure of closeness is the fraction of the domain in which the 
function needs to be modified so as to make it submodular. 

1.3 Preliminaries and Formal Statement of Results 

Given a ground set $U$, a function $f : 2^U \to \mathbb{R}_+$ is called submodular if for all
subsets $S, T \subseteq U$, we have $f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$. A submodular
function $f$ is called non-negative if $f(S) \ge 0$ for each $S \subseteq U$. In this paper, we only
consider non-negative submodular functions. A submodular function $f$ is called symmetric if
$f(S) = f(U \setminus S)$ for each $S \subseteq U$, and monotone if $f(S) \le f(T)$ for each
$S \subseteq T \subseteq U$. We say that a class of functions $\mathcal{G}$ $\theta$-approximates
another class $\mathcal{F}$ if for every $f \in \mathcal{F}$ there exists a $g \in \mathcal{G}$ such
that $f(S) \le g(S) \le \theta \cdot f(S)$ for any $S \subseteq U$. We denote by $n$ the size of the
ground set $U$. We now define certain subclasses of submodular functions
that we consider in the paper.

Definition 1 (Coverage function). A function $f$ is a coverage function if there exists an auxiliary
ground set $Z$, a weight function $w : Z \to \mathbb{R}_+$ and a family of subsets
$\{A_i : A_i \subseteq Z, i \in U\}$ such that

$$\forall S \subseteq U, \quad f(S) = \sum_{z \in \cup_{i \in S} A_i} w(z).$$

Definition 2 (Budgeted additive function). A function $f$ is a budgeted additive function if there
exist non-negative reals $a_i$ for each $i \in U$, and a non-negative real $B$ such that

$$\forall S \subseteq U, \quad f(S) = \min\Big\{B, \sum_{i \in S} a_i\Big\}.$$

It is well known that coverage functions and budgeted additive functions are monotone
submodular functions.
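As an illustration of Definitions 1 and 2, here is a small sketch that builds both kinds of functions and checks submodularity and monotonicity by exhaustive enumeration. The constructor names and the example data (sets, weights, values, budget) are hypothetical and chosen only for demonstration.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def coverage_function(sets, weight):
    """sets[i] is A_i (a set of auxiliary points); weight[z] is w(z)."""
    def f(S):
        covered = set().union(*(sets[i] for i in S))
        return sum(weight[z] for z in covered)
    return f

def budgeted_additive(values, budget):
    """f(S) = min(budget, sum of values[i] for i in S)."""
    def f(S):
        return min(budget, sum(values[i] for i in S))
    return f

def is_submodular(f, universe):
    subsets = list(all_subsets(universe))
    return all(f(S) + f(T) >= f(S | T) + f(S & T)
               for S in subsets for T in subsets)

def is_monotone(f, universe):
    subsets = list(all_subsets(universe))
    return all(f(S) <= f(T) for S in subsets for T in subsets if S <= T)

# Hypothetical instance on the ground set U = {0, 1, 2}.
U = [0, 1, 2]
cov = coverage_function({0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}},
                        {"a": 1, "b": 2, "c": 3})
bud = budgeted_additive({0: 2, 1: 3, 2: 4}, budget=5)
```

Both checks enumerate all pairs of subsets, so they are only practical for very small ground sets.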



Definition 3 (Cut function). A function $f$ is a directed cut function if there exists a directed
graph $G = (U, A)$ with non-negative arc weights $w : A \to \mathbb{R}_+$, such that
$\forall S \subseteq U$, $f(S) = w(\delta^+(S))$, where $\delta^+(S)$ denotes the set of outgoing
arcs, with their tails in $S$ and heads in $\bar{S}$, and $w(F) = \sum_{a \in F} w(a)$
for any subset $F \subseteq A$ of arcs.

Similarly, one can define $f$ to be the undirected cut function of an undirected graph by
substituting $\delta^+(S)$ with $\delta(S)$, the set of edges with exactly one endpoint in $S$. It
is well known that cut functions, whether directed or undirected, are submodular. Furthermore,
clearly, undirected cut functions are symmetric.
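The two kinds of cut functions can be sketched as follows, assuming graphs are given as dictionaries from vertex pairs to non-negative weights. The function names and the example triangle are illustrative assumptions.

```python
def directed_cut(arcs):
    """arcs maps (tail, head) -> weight; f(S) sums arcs leaving S."""
    def f(S):
        return sum(w for (u, v), w in arcs.items() if u in S and v not in S)
    return f

def undirected_cut(edges):
    """edges maps (u, v) -> weight; f(S) sums edges with one endpoint in S."""
    def f(S):
        return sum(w for (u, v), w in edges.items() if (u in S) != (v in S))
    return f

# Hypothetical weighted triangle on vertices 0, 1, 2.
weights = {(0, 1): 2, (1, 2): 1, (2, 0): 3}
d = directed_cut(weights)      # weights read as arcs 0->1, 1->2, 2->0
u = undirected_cut(weights)    # same weights read as undirected edges
```

On this example the directed cut function is not symmetric (the value of {0} differs from that of {1, 2}), while the undirected one is.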

Let us now formally state our main results: 

Theorem 1.1. Let $f : 2^U \to \mathbb{R}_+$ be a non-negative submodular function with
$f(\emptyset) = f(U) = 0$. Then the class of directed cut functions $(n^2/4)$-approximates $f$.
Moreover, there exists a non-negative submodular function $f : 2^U \to \mathbb{R}_+$ with
$f(\emptyset) = f(U) = 0$ such that no directed cut function can approximate $f$ within a factor
better than $n^2/4$.

Theorem 1.2. Let $f : 2^U \to \mathbb{R}_+$ be a non-negative symmetric submodular function with
$f(\emptyset) = 0$. Then the class of undirected cut functions $(n - 1)$-approximates $f$. Moreover,
there exists a symmetric submodular function $f : 2^U \to \mathbb{R}_+$ with $f(\emptyset) = 0$ such
that no undirected cut function can approximate $f$ within a factor better than $n/4$.

Theorem 1.3. Let $f : 2^U \to \mathbb{R}_+$ be a budgeted additive function. Then the class of
coverage functions $(e/(e - 1))$-approximates $f$. Moreover, for every fixed $\epsilon > 0$, there
exists a budgeted additive function $f : 2^U \to \mathbb{R}_+$ such that no coverage function can
approximate $f$ within a factor better than $e/(e - 1) - \epsilon$.

Theorems 1.1, 1.2 and 1.3 are proved in Sections 2, 3 and 4 respectively.

2 Approximating General Submodular Functions by Directed Cut 
Functions of Graphs 

In this section we prove Theorem 1.1 which provides a tight approximation of a non-negative 
submodular function / using a directed cut function of a graph G. Before proving the main result 
of this section, we first state a technical lemma, whose proof we defer to Appendix C. 

Lemma 2.1. For every submodular function $f$, and any collection of sets
$A_1, A_2, \ldots, A_n \subseteq U$:

$$f\big(\cap_{i=1}^{n} A_i\big) \le \sum_{i=1}^{n} f(A_i).$$

We are now ready to prove Theorem 1.1. 

Proof of Theorem 1.1. 

Upper Bound: Given a submodular function $f$, we construct a directed graph $G = (U, A)$ with
non-negative weights $w$ on the arcs such that for every $S \subseteq U$,
$f(S) \le w(\delta^+(S)) \le (n^2/4) \cdot f(S)$. For every $(u, v) \in U \times U$ with $u \ne v$,
introduce a directed arc from $u$ to $v$ with weight $w_{uv} = f(T_{uv})$, where
$T_{uv} = \arg\min \{f(R) : R \subseteq U, u \in R, v \notin R\}$. We start by proving that:

$$f(S) \le w(\delta^+(S)) \quad \forall S \subseteq U. \qquad (1)$$

If $S = U$ or $S = \emptyset$, then clearly $w(\delta^+(S)) = f(S) = 0$ and (1) holds. We now
restrict our attention to the case where $S, \bar{S} \ne \emptyset$. For any $u \in S$ note that
$u \in \cap_{v \in \bar{S}} T_{uv}$, since the definition of $T_{uv}$ implies that $u \in T_{uv}$
for all $v \in \bar{S}$. Additionally, for any $w \in \bar{S}$ note that
$w \notin \cap_{v \in \bar{S}} T_{uv}$, since the definition of $T_{uv}$ implies that
$w \notin T_{uw}$. Thus, one can conclude that
$\cup_{u \in S} \cap_{v \in \bar{S}} T_{uv} = S$ and therefore,

$$f(S) \overset{(i)}{\le} \sum_{u \in S} f\big(\cap_{v \in \bar{S}} T_{uv}\big)
\overset{(ii)}{\le} \sum_{u \in S} \sum_{v \in \bar{S}} w_{uv} = w(\delta^+(S)).$$

Inequality (i) is derived from the fact that $f$ is submodular and non-negative. Inequality (ii) is
derived from Lemma 2.1 and the definition of $w_{uv}$. This concludes the proof of (1).
We continue by proving that:

$$w(\delta^+(S)) \le \frac{n^2}{4} \cdot f(S) \quad \forall S \subseteq U. \qquad (2)$$

If $S = U$ or $S = \emptyset$, then clearly $w(\delta^+(S)) = f(S) = 0$ and (2) holds. We now
restrict our attention to the case where $S, \bar{S} \ne \emptyset$. Note that for any $u \in S$
and $v \in \bar{S}$, by the definition of $T_{uv}$: $f(T_{uv}) \le f(S)$. Thus, one can conclude
that:

$$w(\delta^+(S)) \overset{(i)}{=} \sum_{u \in S} \sum_{v \in \bar{S}} f(T_{uv})
\overset{(ii)}{\le} \sum_{u \in S} \sum_{v \in \bar{S}} f(S)
\overset{(iii)}{\le} \frac{n^2}{4} \cdot f(S).$$

Equality (i) is by the definition of the weights $w_{uv}$. Inequality (ii) is by the definition of
$T_{uv}$. Inequality (iii) is by the fact that the number of pairs $(u, v) \in S \times \bar{S}$ is
at most $n^2/4$. This concludes the proof of (2). Combining both (1) and (2) concludes the proof of
the upper bound of the theorem.
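The upper-bound construction can be sketched by brute force on a tiny ground set. This is not the polynomial-time procedure (which would use submodular minimization to find the minimizing separating sets); the helper names and the example function min(|S|, n - |S|) are assumptions made for illustration.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def build_arc_weights(f, universe):
    """Arc (u, v) gets weight min f(R) over all R with u in R, v not in R."""
    subsets = list(all_subsets(universe))
    return {
        (u, v): min(f(R) for R in subsets if u in R and v not in R)
        for u in universe for v in universe if u != v
    }

def directed_cut_value(arcs, S):
    """Total weight of arcs with tail in S and head outside S."""
    return sum(w for (u, v), w in arcs.items() if u in S and v not in S)

# Hypothetical input: a concave function of |S| (hence submodular),
# non-negative, with value 0 on the empty set and on U.
U = list(range(4))
n = len(U)
f = lambda S: min(len(S), n - len(S))

arcs = build_arc_weights(f, U)
# Check f(S) <= cut(S) <= (n^2 / 4) * f(S) using integer arithmetic.
sandwich_holds = all(
    f(S) <= directed_cut_value(arcs, S)
    and 4 * directed_cut_value(arcs, S) <= n * n * f(S)
    for S in all_subsets(U)
)
```

For this input every arc receives weight 1, so the cut value of a set S is |S| times (n - |S|), which stays within the claimed factor of the original function.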
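Lower Bound: Assume that $n$ is even and fix an arbitrary $A \subseteq U$ of size $|A| = n/2$.
Consider the following function $f$:

$$f(S) = \begin{cases} 1 & \text{if } S \cap A \ne \emptyset \text{ and } \bar{A} \setminus S \ne \emptyset \\ 0 & \text{otherwise} \end{cases}$$

Namely, $f(S)$ is the indicator function that $S$ hits $A$ but does not hit all of $\bar{A}$. A
simple check shows that $f$ is submodular. Let $G = (U, E)$ be a weighted directed graph with
non-negative weights $w : E \to \mathbb{R}_+$ on the arcs whose directed cut function satisfies, for
each set $S \subseteq U$, $f(S) \le w(\delta^+(S)) \le \theta \cdot f(S)$ for some $\theta$. We will
show that $\theta \ge n^2/4$, proving the lower bound.

First, we prove that the arcs with non-zero weight must go from $A$ to $\bar{A}$. We consider the
following cases.

1. Consider an arc $(u, v) \in A \times A$. Then $(u, v) \in \delta^+(U \setminus \{v\})$ and
$w(\delta^+(U \setminus \{v\})) \le \theta \cdot f(U \setminus \{v\}) = 0$ since
$\bar{A} \setminus (U \setminus \{v\}) = \emptyset$. Thus, $w_{(u,v)} = 0$.

2. Consider an arc $(u, v) \in \bar{A} \times \bar{A}$. Then
$(u, v) \in \delta^+(\bar{A} \setminus \{v\})$ and
$w(\delta^+(\bar{A} \setminus \{v\})) \le \theta \cdot f(\bar{A} \setminus \{v\}) = 0$ since
$A \cap (\bar{A} \setminus \{v\}) = \emptyset$. Thus, $w_{(u,v)} = 0$.

3. Consider an arc $(u, v) \in \bar{A} \times A$. Then $(u, v) \in \delta^+(\bar{A})$ and
$w(\delta^+(\bar{A})) \le \theta \cdot f(\bar{A}) = 0$ since $A \cap \bar{A} = \emptyset$. Hence,
$w_{(u,v)} = 0$.

Therefore, all arcs with non-zero weight must go from $A$ to $\bar{A}$. For any $u \in A$ and
$v \in \bar{A}$, the only such arc leaving the set $\{u\} \cup (\bar{A} \setminus \{v\})$ is
$(u, v)$, so

$$w_{(u,v)} = w\big(\delta^+(\{u\} \cup (\bar{A} \setminus \{v\}))\big)
\overset{(i)}{\ge} f\big(\{u\} \cup (\bar{A} \setminus \{v\})\big) \overset{(ii)}{=} 1. \qquad (3)$$

Inequality (i) is derived from the fact that $w(\delta^+(S)) \ge f(S)$ for each set $S$. Equality
(ii) is by the definition of $f$. Furthermore, note that:

$$\frac{n^2}{4} = |A| \cdot |\bar{A}| \overset{(i)}{\le} w(\delta^+(A))
\overset{(ii)}{\le} \theta \cdot f(A) \overset{(iii)}{=} \theta. \qquad (4)$$

Inequality (i) is derived from inequality (3). Inequality (ii) is derived from the fact that
$w(\delta^+(S)) \le \theta \cdot f(S)$ for each set $S \subseteq U$. Equality (iii) is by the
definition of $f$. Note that inequality (4) implies that $\theta \ge n^2/4$, thus concluding the
proof of the lower bound of the theorem. ■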
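The lower-bound instance can be examined numerically. The sketch below, under the assumption of a 6-element ground set with A = {0, 1, 2}, builds the indicator function from the proof, checks its submodularity by enumeration, and evaluates the unit-weight complete bipartite digraph from A to its complement, which is one natural approximating graph. Variable names are illustrative.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

U = list(range(6))
A = frozenset({0, 1, 2})
A_bar = frozenset(U) - A

def f(S):
    """1 if S meets A and misses at least one element of A's complement."""
    return 1 if (S & A) and (A_bar - S) else 0

# Unit-weight arcs from every vertex of A to every vertex outside A.
arcs = {(u, v): 1 for u in A for v in A_bar}

def cut(S):
    return sum(w for (u, v), w in arcs.items() if u in S and v not in S)

subsets = list(all_subsets(U))
is_submodular = all(f(S) + f(T) >= f(S | T) + f(S & T)
                    for S in subsets for T in subsets)
dominates = all(f(S) <= cut(S) for S in subsets)
zero_preserved = all(cut(S) == 0 for S in subsets if f(S) == 0)
worst_ratio = max(cut(S) / f(S) for S in subsets if f(S) > 0)
```

The worst ratio is attained at S = A, where all (n/2) * (n/2) arcs are cut while the function value is 1, matching the n^2/4 bound.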

3 Approximating Symmetric Submodular Functions by Undirected 
Cut Functions of Graphs 

In this section, we prove Theorem 1.2 which provides upper and lower bounds on the approximation
of a symmetric submodular function using an undirected cut function of a graph. For the upper
bound, our algorithm uses Gomory-Hu trees of symmetric submodular functions [15, 21]. Given
a symmetric non-negative submodular function $f$, a tree $T = (U, E_T)$ is a Gomory-Hu tree if for
every edge $e = (u, v) \in E_T$: $f(R_e) = \min\{f(R) : R \subseteq U, u \in R, v \notin R\}$, where
$R_e$ is one of the two connected components obtained after removing $e$ from $T$ (since $f$ is
symmetric, it does not matter which one of the two connected components we choose). In other
words, in a Gomory-Hu tree, the cut induced in $T$ by $e = (u, v)$ corresponds to a minimum value
subset that separates $u$ and $v$. We prove that the cut function of the Gomory-Hu tree of $f$ is a
good approximation.

Proof of Theorem 1.2. 

Upper Bound: Let $f$ be a symmetric submodular function. We shall construct an undirected
tree $T = (U, E_T)$ with non-negative weights $w : E_T \to \mathbb{R}_+$ on the edges such that for
every $S \subseteq U$, $f(S) \le w(\delta(S)) \le (n - 1) \cdot f(S)$. We set $T$ to be a Gomory-Hu
tree of $f$ and let the weight of any edge $e = \{u, v\}$ be $f(R_e)$, where $R_e$ is one of the two
connected components obtained after removing edge $e$. As mentioned above, the weight of edge
$e = \{u, v\}$ is the minimum of $f(R)$ over all $R$ separating $u$ and $v$.

Fix an arbitrary $S \subseteq U$ and denote by $\{e_1, \ldots, e_k\}$ all the edges crossing the cut
that $S$ defines in $T$. Let $T_1, \ldots, T_{k+1}$ denote the partition of $U$ induced by deleting
the edges $e_1, \ldots, e_k$ from $T$. Furthermore, denote by $\{S_1, \ldots, S_p\}$ the non-empty
sets in $\{T_i \cap S : 1 \le i \le k + 1\}$. Observe that $S_1, \ldots, S_p$ is a partition of $S$.
Since each $e_i$, $1 \le i \le k$, has exactly one vertex in $S$ and the other in $\bar{S}$, we can
associate $e_i$ with a unique set from $S_1, \ldots, S_p$, the set containing one of the endpoints
of $e_i$. Additionally, let us denote by $F_i$ the edges which are associated with set $S_i$ for
each $1 \le i \le p$. Clearly $F_1, \ldots, F_p$ form a partition of $\{e_1, \ldots, e_k\}$.

We claim that for every $1 \le i \le p$:

$$f(S_i) \le \sum_{e \in F_i} f(R_e). \qquad (5)$$

Recall that $R_e$ is one of the connected components after removing edge $e$ from $T$. Since $S_i$
is a subset of a connected component formed after removing all the edges $\{e_1, \ldots, e_k\}$ from
$T$, it must be contained in one of the components formed after removing an edge $e \in F_i$ from
$T$. Without loss of generality, we assume that $R_e \cap S_i = \emptyset$ for each edge
$e \in F_i$. It is straightforward to see that $\cap_{e \in F_i} \bar{R}_e = S_i$. Now, we have

$$\sum_{e \in F_i} f(R_e) \overset{(i)}{\ge} f\big(\cup_{e \in F_i} R_e\big)
\overset{(ii)}{=} f\big(\overline{\cup_{e \in F_i} R_e}\big)
= f\big(\cap_{e \in F_i} \bar{R}_e\big) \overset{(iii)}{=} f(S_i).$$

Inequality (i) is derived from the fact that $f$ is submodular and non-negative. Equality (ii) is
derived from the symmetry of $f$. Equality (iii) is derived from the fact that
$\cap_{e \in F_i} \bar{R}_e = S_i$.
We now prove that:

$$f(S) \le w(\delta(S)) \quad \forall S \subseteq U. \qquad (6)$$

This can be proved as follows:

$$w(\delta(S)) \overset{(i)}{=} \sum_{i=1}^{k} f(R_{e_i})
\overset{(ii)}{=} \sum_{i=1}^{p} \sum_{e \in F_i} f(R_e)
\overset{(iii)}{\ge} \sum_{i=1}^{p} f(S_i)
\overset{(iv)}{\ge} f\big(\cup_{i=1}^{p} S_i\big) = f(S).$$

Equality (i) is by the definition of edge weights in $T$. Equality (ii) is by the fact that
$F_1, \ldots, F_p$ form a partition of $\{e_1, \ldots, e_k\}$. Inequality (iii) is derived from
inequality (5). Inequality (iv) is derived from the fact that $f$ is submodular and non-negative.
This concludes the proof of (6).
We continue by proving that:

$$w(\delta(S)) \le (n - 1) \cdot f(S) \quad \forall S \subseteq U. \qquad (7)$$

Let $u$ and $v$ be the endpoints of edge $e_i$ and without loss of generality assume that $u \in S$
and $v \notin S$. Note that for every $1 \le i \le k$, $f(R_{e_i}) \le f(S)$ since $S$ is a
candidate set separating $u$ and $v$. Hence, one can conclude that:

$$w(\delta(S)) \overset{(i)}{=} \sum_{i=1}^{k} f(R_{e_i})
\overset{(ii)}{\le} k \cdot f(S) \overset{(iii)}{\le} (n - 1) \cdot f(S).$$

Equality (i) is by the definition of edge weights in $T$. Inequality (ii) is what we proved above,
and inequality (iii) is derived from the fact that $T$ contains at most $n - 1$ edges, thus
$k \le n - 1$. This concludes the proof of (7). Combining both (6) and (7) concludes the proof of
the upper bound of the theorem.
Lower Bound: Refer to Appendix A. ■
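The tree-based bound can be checked on a toy instance. The sketch below does not construct a Gomory-Hu tree; it only verifies by brute force that a given tree satisfies the Gomory-Hu condition and then tests the two inequalities. The example (the function equal to 1 on every non-trivial set, with a path as the tree) and all helper names are assumptions for illustration.

```python
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def component_side(tree_edges, universe, removed, root):
    """Vertices reachable from `root` after deleting edge `removed`."""
    adj = {x: set() for x in universe}
    for a, b in tree_edges:
        if (a, b) != removed:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {root}, [root]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return frozenset(seen)

def gomory_hu_weights(f, universe, tree_edges):
    """Return {edge: f(R_e)} if every edge meets the Gomory-Hu condition,
    otherwise None (brute-force check over all separating sets)."""
    subsets = list(all_subsets(universe))
    weights = {}
    for u, v in tree_edges:
        R = component_side(tree_edges, universe, (u, v), u)
        best = min(f(X) for X in subsets if u in X and v not in X)
        if f(R) != best:
            return None
        weights[(u, v)] = f(R)
    return weights

U = list(range(4))
n = len(U)
f = lambda S: 1 if 0 < len(S) < n else 0     # symmetric and submodular
path = [(0, 1), (1, 2), (2, 3)]              # hypothetical candidate tree
weights = gomory_hu_weights(f, U, path)

def tree_cut(S):
    return sum(w for (a, b), w in weights.items() if (a in S) != (b in S))

bounds_hold = all(f(S) <= tree_cut(S) <= (n - 1) * f(S) for S in all_subsets(U))
```

For this function every spanning tree passes the check, and the set {0, 2} crosses all three path edges, so the factor n - 1 is reached by this particular tree.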

4 Approximating Budgeted Additive Functions by Coverage Functions

In this section, we present matching upper and lower bounds for approximating budgeted additive
functions by coverage functions (Theorem 1.3)^2. The following lemma from Chakrabarty and
Huang [8] provides the alternate representation of coverage functions used in our proof of the lower
bound.

^2 It is easy to show that a coverage function can be written exactly as a sum of budgeted additive functions.

Lemma 4.1. [8] A function $f : 2^U \to \mathbb{R}_+$ is a coverage function if and only if there
exist reals $x_T \ge 0$ for each $T \subseteq U$ such that
$f(S) = \sum_{T : T \cap S \ne \emptyset} x_T$ for each $S \subseteq U$.



Proof of Theorem 1.3. 

Upper Bound: Refer to Appendix B. 

Lower Bound: We will construct a budgeted additive function which cannot be approximated
by coverage functions to a factor better than $e/(e - 1) - \epsilon$ for any $\epsilon > 0$. We
will consider the family of budgeted additive functions $f_k$, parameterized by the size of the
domain $|U| = n$ they are defined on, where $n = k^2$ for some integer $k$. Under $f_k$, all
$n = k^2$ items have value one and the budget is $k$. Please note that these also constitute a
family of uniform matroid rank functions. Therefore,

$$f_k(S) = \begin{cases} |S| & \text{if } |S| \le k \\ k & \text{otherwise} \end{cases} \qquad (8)$$

Let $h_k$ be a coverage function that gives the maximum value of $\beta$ such that
$\forall S \subseteq [n]$, $\beta \cdot f_k(S) \le h_k(S) \le f_k(S)$, and let $\alpha_k$ be the
value of $\beta$ as given by $h_k$. Observe that here the function $h_k$ is always smaller than the
function $f_k$. The function $h_k/\alpha_k$ would give a $(1/\alpha_k)$-approximation for
approximating the function $f_k$. This slight change in notation helps for exposition below. We
shall show that as $k \to \infty$, $\alpha_k$ tends to a value that is at most $1 - 1/e$. This
shall prove our claim.

Using Lemma 4.1, we note that $\alpha_k$ can be characterized by a solution to a linear program (P)
given below. Here, the variables are $x_T$, one for each set $T \subseteq U$. The dual (D) of this
linear program is given alongside. We will construct a dual solution of value approaching
$1 - 1/e$ as $k \to \infty$. Since every feasible dual solution is an upper bound on $\alpha_k$,
the result follows.



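$$\begin{aligned}
\max \quad & \alpha_k & \text{(P)} \\
\text{subject to} \quad & \forall S \subseteq U, \quad \sum_{T : T \cap S \ne \emptyset} x_T \le f_k(S) \\
& \forall S \subseteq U, \quad \alpha_k \cdot f_k(S) - \sum_{T : T \cap S \ne \emptyset} x_T \le 0 \\
& \forall S \subseteq U, \quad x_S \ge 0
\end{aligned}$$

$$\begin{aligned}
\min \quad & \sum_{S \subseteq U} f_k(S) \cdot u_S & \text{(D)} \\
\text{subject to} \quad & \forall S \subseteq U, \quad \sum_{T : T \cap S \ne \emptyset} (u_T - v_T) \ge 0 \\
& \sum_{S \subseteq U} f_k(S) \cdot v_S \ge 1 \\
& \forall S \subseteq U, \quad u_S, v_S \ge 0
\end{aligned}$$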



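Since $f_k(\cdot)$ is symmetric across sets of the same cardinality, we can assume, without loss of
generality, that the optimal dual solution is also symmetric. Specifically, the values of the dual
variables $u_T$ and $v_T$ shall depend only on the cardinality $|T|$. Let us write the symmetrized
dual program.

$$\begin{aligned}
\min \quad & \sum_{j=1}^{k} j \cdot \binom{n}{j} \cdot u_j + \sum_{j=k+1}^{n} k \cdot \binom{n}{j} \cdot u_j & \text{Symmetrized Dual Program} \\
\text{subject to} \quad & \forall j \in [n], \quad \sum_{i=1}^{n} \left( \binom{n}{i} - \binom{n-j}{i} \right) (u_i - v_i) \ge 0 & (9) \\
& \sum_{j=1}^{k} j \cdot \binom{n}{j} \cdot v_j + \sum_{j=k+1}^{n} k \cdot \binom{n}{j} \cdot v_j \ge 1 & (10)
\end{aligned}$$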

Let $c_j$ denote the coefficient of $v_k$ in the equation corresponding to set size $j$, i.e.,
$c_j = \binom{n}{k} - \binom{n-j}{k}$. Further, define $\Delta c_j = c_{j+1} - c_j$.

We give the following solution to the dual linear program. Let
$v_k = \frac{1}{k \binom{n}{k}}$, $u_1 = \Delta c_k \cdot v_k$ and
$u_n = (c_k - k \cdot \Delta c_k) \cdot v_k$. The rest of the variables are set to zero.

We first show that the above solution is feasible for the dual and has objective value that tends
to $1 - 1/e$ as $k \to \infty$. It is easy to see that with the proposed setting of $v_k$,
Equation (10) is satisfied. To show that Equation (9) is satisfied, we show that
$\forall j \in [n]$, $j \cdot u_1 + u_n \ge \left( \binom{n}{k} - \binom{n-j}{k} \right) v_k$.
Using our notation, it suffices to show that for all $j \in [n]$,
$(j - k) \cdot \Delta c_k + c_k \ge c_j$.

Claim 4.2. $c_j$ is an increasing function of $j$ and $\Delta c_j = c_{j+1} - c_j$ is a decreasing
function of $j$.

Proof. $\Delta c_j = c_{j+1} - c_j
= \left( \binom{n}{k} - \binom{n-j-1}{k} \right) - \left( \binom{n}{k} - \binom{n-j}{k} \right)
= \binom{n-j}{k} - \binom{n-j-1}{k} = \binom{n-j-1}{k-1}$. ■

Claim 4.3. For all $j \in [n]$, $(j - k) \cdot \Delta c_k + c_k \ge c_j$.

Proof. • For $j = k + i$ such that $0 \le i \le n - k$:
LHS $= i \cdot \Delta c_k + c_k \ge c_k + \sum_{q=k}^{k+i-1} \Delta c_q = c_{k+i} =$ RHS.

• For $j = k - i$ with $0 \le i \le k$:
LHS $= c_k - i \cdot \Delta c_k \ge c_k - \sum_{q=k-i}^{k-1} \Delta c_q = c_{k-i} =$ RHS,
where the inequality in both cases follows because $\Delta c_j$ is a decreasing function of
$j$. ■

Let us now bound the value of the dual objective function. Look at the value that the dual
objective function attains with this setting of variables. The function value is
$n \cdot u_1 + k \cdot u_n = (n \cdot \Delta c_k + k \cdot (c_k - k \cdot \Delta c_k)) \cdot v_k$.
Since $n = k^2$, the dual objective value is equal to

$$k \cdot c_k \cdot v_k = 1 - \frac{\binom{k^2 - k}{k}}{\binom{k^2}{k}}.$$

This quantity tends to $1 - 1/e$ as $k \to \infty$. ■
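The dual solution can be checked with exact arithmetic for small k. The following is a minimal sketch, assuming the symmetrized constraints in the form used in the proof (for set size j, the coefficient of u_1 is j, of u_n is 1, and of v_k is c_j); the function names are illustrative and k is assumed to be at least 2.

```python
from fractions import Fraction
from math import comb, e

def dual_solution(k):
    """Return (n, c, v_k, u_1, u_n) for the proposed dual solution, k >= 2."""
    n = k * k
    c = lambda j: comb(n, k) - comb(n - j, k)
    delta_ck = c(k + 1) - c(k)
    v_k = Fraction(1, k * comb(n, k))
    u_1 = delta_ck * v_k
    u_n = (c(k) - k * delta_ck) * v_k
    return n, c, v_k, u_1, u_n

def is_feasible(k):
    """Check constraints (9) and (10) and non-negativity."""
    n, c, v_k, u_1, u_n = dual_solution(k)
    ok10 = k * comb(n, k) * v_k >= 1          # only v_k is non-zero
    ok9 = all(j * u_1 + u_n >= c(j) * v_k for j in range(1, n + 1))
    return ok10 and ok9 and u_1 >= 0 and u_n >= 0

def dual_objective(k):
    """Objective n * u_1 + k * u_n of the symmetrized dual."""
    n, _, _, u_1, u_n = dual_solution(k)
    return n * u_1 + k * u_n

values = {k: float(dual_objective(k)) for k in (2, 3, 5, 10, 30)}
limit = 1 - 1 / e
```

The computed objective values decrease with k and approach 1 - 1/e from above, which is consistent with the limit stated in the proof.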
5 Future Directions

We mention here a couple of main research questions that are left open by the present work. The
first is how well a non-negative monotone submodular function can be approximated by a sum of
matroid rank functions. Dughmi et al. [11] show that the Hessian matrix for a matroid rank sum
has to be negative semi-definite, and it is easy to come up with a budgeted additive function that
does not obey this property. Hence, we cannot hope for the best approximation factor for a
submodular function by a matroid rank sum to be 1; in fact we can show that the approximation
factor cannot be better than some constant bounded away from 1. In terms of positive results, an
$O(\sqrt{n})$ factor
approximation follows from [13] and a 0( maXeeu 1(1 ) follows from a result in Section 44.6(B) in 
[22]. 

The second is approximating a non-negative symmetric submodular function by a hypergraph 
cut function (in this paper, we only considered graph cut functions). The lower bound example 
in the paper for graph cut functions can be extended to show that a r-regular hypergraph cannot 
approximate to a factor better than O(j). In terms of positive results, we know no better than the 
ones mentioned in this paper. 
A Lower Bound of Theorem 1.2

Lower Bound of Theorem 1.2. Consider the following symmetric submodular function $f$:

$$f(S) = \begin{cases} 1 & \text{if } S \ne \emptyset, U \\ 0 & \text{otherwise} \end{cases}$$

Let $G = (U, E)$ be an edge weighted graph with non-negative weights $w : E \to \mathbb{R}_+$ on the
edges whose cut function satisfies $f(S) \le w(\delta(S)) \le \theta \cdot f(S)$ for each set
$S \subseteq U$ for some $\theta$. We will show that $\theta \ge n/4$.

For any vertex $v \in U$, $1 = f(\{v\}) \le w(\delta(\{v\}))$. Thus, the total weight of edges in
$G$ is at least $\frac{1}{2} \sum_{v \in U} w(\delta(\{v\})) \ge \frac{n}{2}$. Every undirected
graph has a non-trivial cut that contains at least half the total weight of edges in the graph,
thus there exists a cut $S \subseteq U$, $S \ne \emptyset, U$, where
$w(\delta(S)) \ge \frac{n}{4}$. The existence of such a cut can be shown by picking a cut at random
where each vertex is in $S$ with probability $\frac{1}{2}$ independently. The expected weight of
the cut will be exactly half the total weight of all edges. Now, we have
$\frac{n}{4} \le w(\delta(S)) \le \theta \cdot f(S) = \theta$, concluding the proof of the lower
bound of the theorem. ■
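The random-cut step can be illustrated by averaging the cut weight over all subsets, which equals the expectation under independent fair coin flips. This is a minimal sketch on a hypothetical weighted graph; the edge weights and names are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def cut_weight(edges, S):
    """Total weight of edges with exactly one endpoint in S."""
    return sum(w for (u, v), w in edges.items() if (u in S) != (v in S))

# Hypothetical weighted graph on vertices 0..3.
edges = {(0, 1): 3, (1, 2): 1, (2, 3): 2, (0, 3): 5, (0, 2): 4}
total = sum(edges.values())

subsets = list(all_subsets(range(4)))
# Uniform average over all 2^n subsets = expectation of the random cut.
average = Fraction(sum(cut_weight(edges, S) for S in subsets), len(subsets))
best = max(cut_weight(edges, S) for S in subsets)
```

Since the empty set and the full vertex set both have cut weight zero, the maximum over non-trivial cuts is at least the average, which is half the total weight.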

B Upper Bound of Theorem 1.3 

Upper Bound of Theorem 1.3. Consider any budgeted additive function $f(\cdot)$ over some domain
$U$, with budget $B$, and let the values of the elements be denoted by $v_1, v_2, \ldots, v_n$
where $n = |U|$. Without loss of generality, we assume all these values to be integers. Take an
auxiliary ground set $G$ of size $B$. For each $i \in U$, construct a set $A_i \subseteq G$, formed
by choosing $v_i$ points (with replacement) at random from $G$. Consider the function
$g : 2^U \to \mathbb{Z}$, defined as $g(S) = |\cup_{i \in S} A_i|$ for all $S \subseteq U$.

By definition, $g(\cdot)$ is a coverage function. Furthermore, it is easy to see that for all
$S \subseteq U$, $g(S) \le f(S)$. We now show that $E[g(S)] \ge (1 - 1/e) \cdot f(S)$, where the
expectation is taken over the randomness of the procedure described to construct $g(S)$. Note that
$g'(\cdot) = E[g(\cdot)]$ is a coverage function. Consider any set $S \subseteq U$. Let
$f(S) = V$, i.e., $\sum_{i \in S} v_i = V$. Consider the case when $V < B$. Consider any point in
the auxiliary ground set $G$. The probability that this point is not covered by any of the sets
$A_i$ for $i \in S$ is at most $(1 - 1/B)^V$. Hence, the expected value of
$|\cup_{i \in S} A_i|$ is at least
$B \cdot (1 - (1 - 1/B)^V) \ge B \cdot (1 - e^{-V/B}) \ge (1 - 1/e) \cdot V$. Here we use the
inequality $1 - e^{-x} \ge (1 - 1/e) \cdot x$ for $0 \le x \le 1$. Hence,
$E[|\cup_{i \in S} A_i|] \ge (1 - 1/e) \cdot f(S)$. The proof for the case when $V = B$ is
similar. Thus for each set $S \subseteq U$, we have

$$\Big(1 - \frac{1}{e}\Big) f(S) \le g'(S) \le f(S).$$

Thus, we obtain that the function $\frac{e}{e-1} g'(\cdot)$ approximates $f$ within a factor of
$\frac{e}{e-1}$. ■
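The expected coverage in this construction has a closed form, since each of the B auxiliary points is missed by every draw independently. The sketch below evaluates that expectation exactly (rather than sampling) and checks the two-sided bound on a hypothetical instance; the values, the budget, and the function names are assumptions for illustration.

```python
from itertools import combinations
from math import e

def all_subsets(universe):
    """Yield every subset of `universe` as a frozenset."""
    items = list(universe)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def budgeted_value(values, budget, S):
    """f(S) = min(budget, sum of values over S)."""
    return min(budget, sum(values[i] for i in S))

def expected_coverage(values, budget, S):
    """E[g(S)]: a fixed point is missed by all draws for S with
    probability (1 - 1/budget) ** (number of draws)."""
    draws = sum(values[i] for i in S)
    return budget * (1 - (1 - 1 / budget) ** draws)

# Hypothetical instance: four elements with integer values, budget 4.
values = [1, 2, 3, 2]
budget = 4
tol = 1e-12   # tolerance for floating-point comparisons
bound_holds = all(
    (1 - 1 / e) * budgeted_value(values, budget, S) - tol
    <= expected_coverage(values, budget, S)
    <= budgeted_value(values, budget, S) + tol
    for S in all_subsets(range(len(values)))
)
```

The lower bound is tightest for sets whose total value is close to the budget, where the expectation is roughly B times (1 - 1/e).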

C Proof of Lemma 2.1 

Proof of Lemma 2.1. The following inequalities are derived from the definition of submodularity:

$$\begin{aligned}
f(A_1) + f(A_2) &\ge f(A_1 \cap A_2) + f(A_1 \cup A_2) \\
f(A_3) + f(A_1 \cap A_2) &\ge f(A_1 \cap A_2 \cap A_3) + f((A_1 \cap A_2) \cup A_3) \\
&\;\;\vdots \\
f(A_n) + f\big(\cap_{i=1}^{n-1} A_i\big) &\ge f\big(\cap_{i=1}^{n} A_i\big) + f\big((\cap_{i=1}^{n-1} A_i) \cup A_n\big)
\end{aligned}$$

Summing up the above inequalities, canceling common terms on the two sides and using
the fact that $f$ is non-negative, we obtain that
$\sum_{i=1}^{n} f(A_i) \ge f\big(\cap_{i=1}^{n} A_i\big)$. ■

D Uniform Submodular and Matroid Rank Functions 

Definition 4 (Uniform Submodular Function). A submodular function is said to be uniform if the 

value it takes on a set depends only on the cardinality of the set. 

Lemma D.l. Any non-negative, integer-valued, monotone, uniform, submodular function is 1- 
approximated by a sum of uniform matroid rank functions, and hence by a sum of budgeted additive 
functions. 

Proof. Consider an integer-valued, non-negative, monotone, uniform, submodular function $f(\cdot)$
over the universe $[n]$ and let $f_k$ be the value $f$ takes for sets $S$ of cardinality $k$. Consider the uniform
matroid rank functions $g_1, g_2, \ldots, g_n$ where $g_i(S) = \min\{|S|, i\}$ for all $S \subseteq [n]$.

We claim that there exist $\alpha_i$'s, with $\alpha_i \ge 0$ for all $i \in [n]$, such that

$$\forall k \in [n], \quad f_k = \sum_{i=1}^{k} \alpha_i \cdot i + \sum_{i=k+1}^{n} \alpha_i \cdot k. \tag{12}$$

It is easy to see that the above claim implies that $f(S) = \sum_{i=1}^{n} \alpha_i \cdot g_i(S)$ for all $S \subseteq [n]$.

Now, we prove the claim. If, for every $j \in [n-1]$, we subtract equation $j$ from equation $j+1$, we get
the following set of equations:

$$\forall j \in [n-1], \quad f_{j+1} - f_j = \sum_{i=j+1}^{n} \alpha_i. \tag{13}$$

From here, we can see that the following assignment to the $\alpha_i$'s is a valid solution to the above
equations. Set $\alpha_1 = 2 f_1 - f_2$, $\alpha_n = f_n - f_{n-1}$, and for $i \notin \{1, n\}$, set $\alpha_i = 2 f_i - f_{i+1} - f_{i-1}$. The
value of $\alpha_1$ follows from equation (12) with $k = 1$. All the $\alpha_i$'s are non-negative since $f$ is
non-negative, monotone and submodular.

Finally, it is easy to see that every uniform matroid rank function is also a budgeted additive 
function with all elements having value one, and the budget equal to the rank of the matroid. ■ 
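
The decomposition in the proof can be sketched in Python as follows. The code takes the values $f_1, \ldots, f_n$ of a uniform function, computes the second-difference coefficients $\alpha_i$, and recombines them with the rank functions $\min\{|S|, i\}$. It assumes $f(\emptyset) = 0$, so that $\alpha_1 = 2 f_1 - f_2$. The function names and the example values are illustrative.

```python
from typing import List


def rank_coefficients(values: List[int]) -> List[int]:
    """Given values[k-1] = f_k for k = 1..n, return [alpha_1, ..., alpha_n]
    with alpha_i = 2*f_i - f_{i+1} - f_{i-1} for i < n (using f_0 = 0)
    and alpha_n = f_n - f_{n-1}."""
    n = len(values)
    f = [0] + list(values)  # pad with f_0 = 0 (assumption: f(empty set) = 0)
    alpha = [0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 2 * f[i] - f[i + 1] - f[i - 1]
    alpha[n] = f[n] - f[n - 1]
    return alpha[1:]


def combine(alpha: List[int], k: int) -> int:
    """Value of sum_i alpha_i * min(k, i) on a set of cardinality k."""
    return sum(a * min(k, i) for i, a in enumerate(alpha, start=1))


if __name__ == "__main__":
    values = [5, 8, 10, 10]  # hypothetical monotone, concave sequence f_1..f_4
    alpha = rank_coefficients(values)
    print(alpha)
    print([combine(alpha, k) for k in range(1, len(values) + 1)])
```

For the example sequence the coefficients are non-negative and the recombined values agree with the input, which is what equation (12) requires.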
E Application to Online Submodular Function Maximization

We consider the problem of online submodular function maximization as studied by Buchbinder et
al. [6]. We are given a universe $U$ and a matroid $\mathcal{M} = (U, \mathcal{I})$. In an online manner, at each step $i$, for
$1 \le i \le m$, we are given a monotone submodular function $f_i : 2^U \to \mathbb{R}_+$. The goal is to maintain
an independent set $F_i \in \mathcal{I}$ at every step $i$ such that $F_i \subseteq F_{i+1}$. The objective value to maximize is
$\sum_{i=1}^{m} f_i(F_i)$. As in competitive analysis, any algorithm is compared to the best offline
optimum $\max_{O \in \mathcal{I}} \sum_{i=1}^{m} f_i(O)$.

Buchbinder et al. [6] give an $O(\log n \log m \log f_{\mathrm{ratio}})$-competitive algorithm when each of the
submodular functions is a weighted matroid rank function, where $f_{\mathrm{ratio}}$ is a ratio of maximum to
minimum values of the functions $f_i$, defined precisely in [6]. In
particular, the result applies when each of the functions $f_i$ is a coverage function.

Using the fact that every monotone submodular function can be approximated by a coverage function
to a factor of $O(\sqrt{n} \log n)$, we directly obtain the following corollary.

Corollary E.1. There is an $O(\sqrt{n} \log n \log m f_{\mathrm{ratio}})$-competitive online algorithm for the online
submodular function maximization problem when each of the submodular functions is an arbitrary
monotone submodular function.

F Approximating Monotone Submodular Functions by Coverage 
Functions and by Budgeted Additive Functions 

The two main results of the section are the following. 

Theorem F.1. Coverage functions can approximate every non-negative monotone submodular
function to within a factor $O(\sqrt{n} \log n)$. Additionally, the class of coverage functions cannot
approximate every non-negative monotone submodular function to within a factor $o(n^{1/3} / \log^2 n)$.

Theorem F.2. The class of sums of budgeted additive functions can approximate every non-negative
monotone submodular function to within a factor $O(\sqrt{n} \log n)$. Additionally, the class of sums of
budgeted additive functions cannot approximate every non-negative monotone submodular function
to within a factor $o(n^{1/3} / \log^2 n)$.

F.1 Upper Bound

For the upper bound, we show that sums of budgeted additive functions can $\sqrt{n} \log(n)$-approximate the
class of non-negative, monotone, submodular functions. Then we use Theorem 1.3 to infer that
coverage functions give approximately the same guarantee.

Lemma F.3. The class of sums of budgeted additive functions can approximate every non-negative
monotone submodular function to within a factor $O(\sqrt{n} \log n)$.

The following corollary follows easily from Lemma F.3 and Theorem 1.3.

Corollary F.4. The class of sums of coverage functions can approximate every non-negative
monotone submodular function to within a factor $O(\sqrt{n} \log n)$.

The proof of Lemma F.3 follows from Lemmas F.5 and F.6. Lemma F.5 [13] gives a particular
function that $\sqrt{n} \log(n)$-approximates a general monotone, non-negative, submodular function,
and Lemma F.6 implies that this approximating function can be written as a sum of budgeted
additive functions.

Lemma F.5. [13] For every monotone submodular function $f : 2^U \to \mathbb{R}_+$, there exist positive
reals $a_e$ for each $e \in U$ such that $g : 2^U \to \mathbb{R}_+$ defined as $g(S) = \sqrt{\sum_{e \in S} a_e}$ approximates $f$ within
factor $\sqrt{n} \log(n)$.

Lemma F.6. Every submodular function $f : 2^{[n]} \to \mathbb{Z}_+$ of the form $f(S) = g(\sum_{i \in S} a_i)$, where
$a_i \in \mathbb{Z}_+$ and $g$ is non-negative, monotone, concave and integer-valued on integral inputs, can be
written as a sum of budgeted additive functions.

Proof. Let $m = \sum_{i=1}^{n} a_i$. Consider the function $h$, over the domain $[m]$, defined as $h(S) = g(|S|)$ for
all $S \subseteq [m]$. Since $g(\cdot)$ is a non-negative, monotone, concave function, it is easy to verify that $h(\cdot)$
is a non-negative, monotone, submodular function. Construct $n$ mutually disjoint sets $A_j \subseteq [m]$
such that $|A_j| = a_j$. Clearly, $\forall S \subseteq [n]$, $f(S) = h(\bigcup_{j \in S} A_j)$.

From Lemma D.1, we know that $h(\cdot)$ can be expressed as $\sum_{i=1}^{m} \alpha_i \cdot t_i(\cdot)$ where each $t_i$ is a
uniform matroid rank function with rank $i$ over the domain $[m]$, and each $\alpha_i \ge 0$.

This implies that for all $S \subseteq [n]$, $f(S) = \sum_{i=1}^{m} \alpha_i \cdot t_i(\bigcup_{j \in S} A_j)$. Now, for every $i \in [m]$, construct
the budgeted additive function $t'_i$, defined as $\forall S \subseteq [n]$, $t'_i(S) = \min\{\sum_{j \in S} v_{ij}, B_i\}$, where for $j \in [n]$,
the value $v_{ij} = a_j$ and the budget $B_i$ is $i$. Since $t_i$ is a uniform matroid rank function of rank $i$ and the
$A_j$'s are mutually disjoint, we have for all $i \in [m]$,

$$\forall S \subseteq [n], \quad t'_i(S) = t_i\left(\bigcup_{j \in S} A_j\right). \tag{14}$$

Therefore, for all sets $S \subseteq [n]$, $f(S) = \sum_{i=1}^{m} \alpha_i \cdot t'_i(S)$. Since a non-negative multiple of a budgeted
additive function is again budgeted additive, this expresses $f$ as a sum of budgeted additive functions. ■
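
The construction in this proof can be exercised on a toy instance. The Python sketch below picks a hypothetical concave, monotone, integer-valued $g$ with $g(0) = 0$ and hypothetical integer weights $a_j$, computes the coefficients $\alpha_i$ as in Lemma D.1, and checks that $\sum_i \alpha_i \min\{\sum_{j \in S} a_j, i\}$ equals $g(\sum_{j \in S} a_j)$ for every $S$. All names and values are assumptions chosen for illustration.

```python
from itertools import combinations


def concave_g(x: int) -> int:
    """A hypothetical concave, monotone, integer-valued function with g(0) = 0."""
    return min(2 * x, x + 3, 8)


def coefficients(g, m: int) -> dict:
    """alpha_i for i = 1..m: negated second differences of g, and
    alpha_m = g(m) - g(m-1), following the assignment in Lemma D.1."""
    alpha = {}
    for i in range(1, m):
        alpha[i] = 2 * g(i) - g(i + 1) - g(i - 1)
    alpha[m] = g(m) - g(m - 1)
    return alpha


def budgeted_additive_sum(a, alpha, S) -> int:
    """sum_i alpha_i * min(sum_{j in S} a_j, i): budget i, element values a_j."""
    total = sum(a[j] for j in S)
    return sum(c * min(total, i) for i, c in alpha.items())


def check_decomposition(a, g) -> bool:
    """Verify f(S) = g(sum_{j in S} a_j) against the budgeted additive sum
    for every subset S of the index set."""
    alpha = coefficients(g, sum(a))
    n = len(a)
    for r in range(n + 1):
        for S in combinations(range(n), r):
            if budgeted_additive_sum(a, alpha, S) != g(sum(a[j] for j in S)):
                return False
    return True


if __name__ == "__main__":
    weights = [2, 3, 1, 4]  # hypothetical a_1..a_4, so m = 10
    print(coefficients(concave_g, sum(weights)))
    print(check_decomposition(weights, concave_g))
```

Only the budgets at which the slope of $g$ drops receive a non-zero coefficient, so the number of budgeted additive functions actually needed can be much smaller than $m$.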

F.2 Lower Bound 

For the lower bound, we first show that sums of coverage functions cannot approximate the class of
monotone submodular functions well, and then use Theorem 1.3 to infer that the
class of sums of budgeted additive functions cannot approximate monotone submodular functions
well either.

Lemma F.7. The class of sums of coverage functions cannot approximate every non-negative
monotone submodular function to within a factor $o(n^{1/3} / \log^2 n)$.

An easy corollary of Lemma F.7 that follows from Theorem 1.3 is the following.

Corollary F.8. The class of sums of budgeted additive functions cannot approximate every
non-negative monotone submodular function to within a factor $o(n^{1/3} / \log^2 n)$.

We now present the proof of Lemma F.7. We will need results from [1] and [2], for which
we first present a definition.

Definition 5. A $\beta$-sketch of a function $f : 2^U \to \mathbb{R}$ is a polynomially sized (in $|U|$ and $1/(1 - \beta)$)
representable function $g$ such that $\forall S \subseteq U$, $\beta \cdot f(S) \le g(S) \le f(S)$.

The following result is from [1].

Lemma F.9. [1] Coverage functions admit arbitrarily good sketches, i.e., for any $\epsilon > 0$, there
exists a $(1 - \epsilon)$-sketch.

The following result is from [2]. It gives a 'large' family of matroid rank functions, such that 
any two functions in the class have at least one point where the values that they take differ by a 
'significant' factor. 

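Lemma F.10. [2] For any $k = 2^{o(n^{1/3})}$, there exists a family of sets $\mathcal{A} \subseteq 2^{[n]}$ with $|\mathcal{A}| = k$ and a
family of matroids $\mathcal{M} = \{M_B : B \subseteq \mathcal{A}\}$ such that for all $B \subseteq \mathcal{A}$, it is the case that

$$\forall S \in \mathcal{A}, \quad r_{M_B}(S) = \begin{cases} 8 \log k & \text{if } S \in B \\ n^{1/3} & \text{if } S \notin B \end{cases} \tag{15}$$

where $r_{M_B}$ is the rank function of the matroid $M_B$.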

Proof of Lemma F.7. Let the class of matroid rank functions on a domain of size $n$ be $\alpha$-approximable
by coverage functions, for some $\alpha$. That is, for the domain $[n]$ and every matroid rank function $r$, there
exists a coverage function $g$ such that $\forall S \subseteq [n]$, $r(S) \le g(S) \le \alpha \cdot r(S)$.

By Lemma F.9, for every $\epsilon > 0$ and every coverage function $g$, there exists a polynomially sized
(polynomial in $n$ and $1/\epsilon$) representable function $h$ such that $\forall S \subseteq [n]$, $(1 - \epsilon) \cdot g(S) \le h(S) \le g(S)$.
Hence, for all $\epsilon > 0$ and for all matroid rank functions $r$, there exists a polynomially sized representable
function $h$ such that $\forall S \subseteq [n]$, $r(S) \le h(S)/(1 - \epsilon) \le \frac{\alpha}{1 - \epsilon} \cdot r(S)$. For any given $\epsilon > 0$, there are only
$2^{\mathrm{poly}(n, 1/\epsilon)}$ many different $h$ functions.

From Lemma F.10, for $k = 2^{\log^2 n}$, there exists a family of sets $\mathcal{A} \subseteq 2^{[n]}$ with $|\mathcal{A}| = k$, and a family
of $2^k$ matroids $\{M_B : B \subseteq \mathcal{A}\}$ such that for all $B \subseteq \mathcal{A}$,

$$\forall S \in \mathcal{A}, \quad r_{M_B}(S) = \begin{cases} 8 \log^2 n & \text{if } S \in B \\ n^{1/3} & \text{if } S \notin B \end{cases} \tag{16}$$

Now while the number of different $h$ functions is $2^{\mathrm{poly}(n, 1/\epsilon)}$, the number of different matroid rank
functions in this family is $2^k = 2^{n^{\log n}}$. Hence, by the pigeonhole principle, there must be two matroids
$M_B$ and $M_{B'}$ ($B \ne B'$) such that the coverage functions $g$ and $g'$ approximating $r_{M_B}$ and $r_{M_{B'}}$
respectively have the same polynomially sized representation $h$. But since, for every set $S \in B \,\triangle\, B'$,
$r_{M_B}(S)$ and $r_{M_{B'}}(S)$ differ by a factor of $\Omega(n^{1/3} / \log^2 n)$, $h$ cannot approximate at least one
of these two to a factor better than $\Omega(n^{1/3} / \log^2 n)$. Since the values of $g$ and $g'$ at any point in the
domain differ from that of $h$ by a factor of at most $1 - \epsilon$, it follows that $\alpha = \Omega(n^{1/3} / \log^2 n)$. ■
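
The pigeonhole step relies on the family of $2^k$ matroids, with $k = 2^{\log_2^2 n} = n^{\log_2 n}$, eventually outnumbering the sketches. The Python sketch below assumes, purely for illustration, that a sketch occupies at most $(n/\epsilon)^d$ bits for a hypothetical degree $d$, and finds the first power of two $n$ at which $k$ exceeds that size. The polynomial form and the function names are assumptions, not taken from [1].

```python
import math


def log2_family_size(n: int) -> float:
    """log2 of the number of matroids in the family, i.e. k = n^(log2 n)."""
    return n ** math.log2(n)


def log2_sketch_count(n: int, eps: float, degree: int) -> float:
    """log2 of an assumed upper bound 2^((n/eps)^degree) on the number of
    distinct sketches (hypothetical polynomial size bound)."""
    return (n / eps) ** degree


def crossover(eps: float, degree: int, limit: int = 2 ** 40):
    """Smallest power of two n at which the matroid family outnumbers the
    sketches, or None if no such n is found up to the limit."""
    n = 2
    while log2_family_size(n) <= log2_sketch_count(n, eps, degree):
        n *= 2
        if n > limit:
            return None
    return n


if __name__ == "__main__":
    print(crossover(0.5, 3))
```

Because $n^{\log_2 n}$ is superpolynomial, a crossover exists for every fixed degree and every fixed $\epsilon$, which is all the counting argument needs.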